Predicting Query Performance Directly from Score Distributions

Author

  • Ronan Cummins
Abstract

The task of predicting query performance has received much attention over the past decade. However, many of the frameworks and approaches to predicting query performance are more heuristic than not. In this paper, we develop a principled framework based on modelling the document score distribution to predict query performance directly. In particular, we (1) show how a standard performance measure (e.g. average precision) can be inferred from a document score distribution. We (2) develop techniques for query performance prediction (QPP) by automatically estimating the parameters of the document score distribution (i.e. mixture model) when relevance information is unknown. Therefore, the QPP approaches developed herein aim to estimate average precision directly. Finally, we (3) provide a detailed analysis of one of the QPP approaches that shows that only two parameters of the five-parameter mixture distribution are of practical importance.
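As a rough illustration of point (1), the sketch below computes an expected average precision from a two-component score mixture by integrating precision over recall. It is not the paper's method: the abstract does not name the component distributions, so normal components are assumed here, and the function and parameter names (`expected_average_precision`, `mu_rel`, `sigma_rel`, `mu_non`, `sigma_non`, `lam`) are hypothetical. The five arguments mirror a five-parameter mixture: two means, two standard deviations, and the mixing weight.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_sf(x, mu, sigma):
    """Survival function P(S > x) of a normal distribution."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def expected_average_precision(mu_rel, sigma_rel, mu_non, sigma_non, lam,
                               steps=20000):
    """Approximate AP as the integral of precision(s) * f_rel(s) ds.

    Assumption (not from the paper): both score components are normal.
    lam is the mixing weight, i.e. the proportion of relevant documents.
    """
    lo = min(mu_rel - 8.0 * sigma_rel, mu_non - 8.0 * sigma_non)
    hi = max(mu_rel + 8.0 * sigma_rel, mu_non + 8.0 * sigma_non)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        s = lo + (i + 0.5) * h  # midpoint rule
        rel_above = lam * normal_sf(s, mu_rel, sigma_rel)
        non_above = (1.0 - lam) * normal_sf(s, mu_non, sigma_non)
        denom = rel_above + non_above
        precision = rel_above / denom if denom > 0.0 else 1.0
        # f_rel(s) ds is the increment in recall at score threshold s.
        total += precision * normal_pdf(s, mu_rel, sigma_rel) * h
    return total
```

When the two components coincide, scores carry no information about relevance and the estimate falls back to the proportion of relevant documents; as the relevant component moves away from the nonrelevant one, the estimate approaches 1.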

Similar resources

Measuring the Ability of Score Distributions to Model Relevance

Modelling the score distribution of documents returned from any information retrieval (IR) system is of both theoretical and practical importance. The goal is to infer, to some degree of confidence, which documents are relevant and which are nonrelevant based on their scores. In this paper, we show how the performance of mixtures of score distributions can be compared using inference of query perf...

Do Clarity Scores for Queries Correlate with User Performance?

Recently the concept of a clarity score was introduced in order to measure the ambiguity of a query in relation to the collection in which the query issuer is seeking information [Cronen-Townsend et al., Proc. ACM SIGIR 2002, Tampere, Finland, August 2002]. If the query is expressed in the “same language” as the whole collection then it has a low clarity score, otherwise it has a high score, where ...

Using Models of Score Distributions in Information Retrieval

Empirical modeling of a number of different text search engines shows that the score distributions on a per query basis may be fitted approximately using an exponential distribution for the set of nonrelevant documents and a normal distribution for the set of relevant documents. This model fits not only probabilistic search engines like INQUERY but also vector space search engines like SMART an...
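The exponential/normal model described in this excerpt can be turned into a probability of relevance for a given score with Bayes' rule. The following is a minimal sketch under stated assumptions: the function and parameter names (`prob_relevant`, `prior_rel`, `mean_rel`, `std_rel`, `rate_non`) are hypothetical, and the parameters are taken as already known rather than fitted from data.

```python
import math


def exponential_pdf(score, rate):
    """Density of an exponential distribution (nonrelevant scores)."""
    return rate * math.exp(-rate * score) if score >= 0.0 else 0.0


def normal_pdf(score, mean, std):
    """Density of a normal distribution (relevant scores)."""
    z = (score - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def prob_relevant(score, prior_rel, mean_rel, std_rel, rate_non):
    """Posterior P(relevant | score) under the two-component mixture.

    prior_rel is the assumed proportion of relevant documents.
    """
    rel = prior_rel * normal_pdf(score, mean_rel, std_rel)
    non = (1.0 - prior_rel) * exponential_pdf(score, rate_non)
    total = rel + non
    return rel / total if total > 0.0 else 0.0
```

For example, with illustrative parameters `prior_rel=0.1`, `mean_rel=0.8`, `std_rel=0.1`, `rate_non=10.0`, a score of 0.8 is assigned a much higher probability of relevance than a score of 0.2. Because a normal tail decays faster than an exponential one, the posterior under this pairing can fall again at scores far above `mean_rel`.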

SnippetGen: Enhancing the Code Search via Intent Predicting

To enable the code search results to run immediately without any subsequent modification, an intent-enhanced code search approach (IECS) is proposed. It has the ability of intent predicting to guess what else a user might do after obtaining the search results. Based on the intent-relevant semantic and structural matches, IECS improves the performance of code search by incorporating the intent for...

Standard Deviation as a Query Hardness Estimator

In this paper a new Query Performance Prediction method is introduced. This method is based on the hypothesis that different score distributions appear for ‘hard’ and ‘easy’ queries. We then propose a set of measures which try to capture the differences between both types of distributions, focusing on the degree of dispersion among the scores. We have applied some variants of the classic stan...
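A dispersion-based predictor of the kind this excerpt describes can be sketched as the standard deviation of the top-ranked retrieval scores. The excerpt is truncated before the specific variants are named, so this shows only the plain population standard deviation; the function name `score_dispersion` and the cutoff `k` are hypothetical, and the excerpt does not say which direction of dispersion indicates a hard query.

```python
import statistics


def score_dispersion(scores, k=100):
    """Population standard deviation of the k highest retrieval scores.

    k is an illustrative cutoff, not a value taken from the paper.
    """
    top = sorted(scores, reverse=True)[:k]
    if len(top) < 2:
        return 0.0  # dispersion is undefined for fewer than two scores
    return statistics.pstdev(top)
```

Comparing the value across queries run against the same system gives a simple per-query feature that can then be correlated with a measured effectiveness score such as average precision.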

Publication date: 2011